energy distance
Beyond Expected Information Gain: Stable Bayesian Optimal Experimental Design with Integral Probability Metrics and Plug-and-Play Extensions
Wu, Di, Liang, Ling, Yang, Haizhao
Bayesian Optimal Experimental Design (BOED) provides a rigorous framework for decision-making tasks in which data acquisition is often the critical bottleneck, especially in resource-constrained settings. Traditionally, BOED selects designs by maximizing the expected information gain (EIG), commonly defined through the Kullback-Leibler (KL) divergence. However, classical evaluation of EIG often involves challenging nested expectations, and even advanced variational methods leave the underlying log-density-ratio objective unchanged. As a result, support mismatch, tail underestimation, and rare-event sensitivity remain intrinsic concerns for KL-based BOED. To address these fundamental bottlenecks, we introduce a BOED framework that replaces density-based divergences with integral probability metrics (IPMs), including the Wasserstein distance, Maximum Mean Discrepancy, and Energy Distance, resulting in a highly flexible plug-and-play BOED framework. We establish theoretical guarantees showing that IPM-based utilities provide stronger geometry-aware stability under surrogate-model error and prior misspecification than classical EIG-based utilities. We also validate the proposed framework empirically, demonstrating that IPM-based designs yield highly concentrated credible sets. Furthermore, by extending the same sample-based BOED template in a plug-and-play manner to geometry-aware discrepancies beyond the IPM class, illustrated by a neural optimal transport estimator, we achieve accurate optimal designs in high-dimensional settings where conventional nested Monte Carlo estimators and advanced variational methods fail.
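To make the sample-based template concrete, here is a minimal sketch of one plug-and-play IPM utility, the energy distance between two sample sets; the function and variable names are illustrative and not taken from the paper:

```python
import numpy as np
from scipy.spatial.distance import cdist

def energy_distance(x, y):
    """Sample-based energy distance 2*E||X-Y|| - E||X-X'|| - E||Y-Y'||
    (V-statistic form: the zero diagonal terms are included)."""
    return 2 * cdist(x, y).mean() - cdist(x, x).mean() - cdist(y, y).mean()

rng = np.random.default_rng(0)
prior = rng.normal(0.0, 1.0, size=(500, 2))      # e.g. prior samples
posterior = rng.normal(0.5, 0.5, size=(500, 2))  # e.g. posterior samples under a candidate design
print(energy_distance(prior, posterior))         # larger => the design shifts beliefs more
```

Because the estimator only needs pairwise distances between samples, any of the IPMs named in the abstract could be swapped in behind the same interface, which is the plug-and-play point.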
Generative Modeling under Non-Monotonic MAR Missingness via Approximate Wasserstein Gradient Flows
Kremling, Gitte, Näf, Jeffrey, Lederer, Johannes
The prevalence of missing values in data science poses a substantial risk to any further analyses. Despite a wealth of research, principled nonparametric methods to deal with general non-monotone missingness are still scarce. Instead, ad-hoc imputation methods are often used, for which it remains unclear whether the correct distribution can be recovered. In this paper, we propose FLOWGEM, a principled iterative method for generating a complete dataset from a dataset with values Missing at Random (MAR). Motivated by convergence results of the ignoring maximum likelihood estimator, our approach minimizes the expected Kullback-Leibler (KL) divergence between the observed data distribution and the distribution of the generated sample over different missingness patterns. To minimize the KL divergence, we employ a discretized particle evolution of the corresponding Wasserstein Gradient Flow, where the velocity field is approximated using a local linear estimator of the density ratio. This construction yields a data generation scheme that iteratively transports an initial particle ensemble toward the target distribution. Simulation studies and real-data benchmarks demonstrate that FLOWGEM achieves state-of-the-art performance across a range of settings, including the challenging case of non-monotonic MAR mechanisms. Together, these results position FLOWGEM as a principled and practical alternative to existing imputation methods, and a decisive step towards closing the gap between theoretical rigor and empirical performance.
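The following toy sketch illustrates the discretized particle flow in one dimension: particles are transported along a finite-difference estimate of grad log(p/q). Note that FLOWGEM uses a local linear estimator of the density ratio; this sketch substitutes a simple Gaussian-KDE ratio purely for illustration, and all names are assumptions.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(1)
target = rng.normal(2.0, 0.5, size=1000)     # "observed" sample (1-D toy)
particles = rng.normal(0.0, 1.0, size=1000)  # initial particle ensemble

p = gaussian_kde(target)                     # fixed estimate of the target density
step, eps = 0.1, 1e-3
for _ in range(50):
    q = gaussian_kde(particles)              # current estimate of the particle density
    log_ratio = lambda x: np.log(p(x) + 1e-12) - np.log(q(x) + 1e-12)
    # finite-difference estimate of grad log(p/q), the flow's velocity field
    v = (log_ratio(particles + eps) - log_ratio(particles - eps)) / (2 * eps)
    particles = particles + step * v         # one discretized flow step

print(particles.mean(), particles.std())     # should drift toward (2.0, 0.5)
```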
Masking criteria for selecting an imputation model
Yang, Yanjiao, Suen, Daniel, Chen, Yen-Chi
Missing data is a common problem across various scientific disciplines, including medical research (Bell et al., 2014), social sciences (Molenberghs et al., 2014), and astronomy (Ivezić et al., 2020). To handle missing entries in the dataset, imputation (Grzesiak et al., 2025; Kim and Shao, 2021; Little and Rubin, 2019) is a popular approach that is widely accepted in practice. An imputation model generates plausible values for each missing entry, transforming an incomplete dataset into a complete one. The critical importance of this task has led to the development of a wide array of imputation models, grounded in various modeling assumptions. These range from traditional approaches like hot-deck imputation (Little and Rubin, 2019) to more sophisticated methods such as Multiple Imputation via Chained Equations (MICE; Van Buuren and Groothuis-Oudshoorn 2011), random forest imputation (Stekhoven and Bühlmann, 2012), techniques based on Markov assumptions on graphs (Yang and Chen, 2025), and even generative adversarial networks (Yoon et al., 2018). Despite the proliferation of imputation models, the selection of an optimal imputation model for a given dataset remains a significant challenge, largely due to the unsupervised nature of the problem. Among the many proposed strategies for evaluating and selecting imputation models, masking has emerged as a particularly popular procedure (Gelman et al., 1998; Honaker et al., 2011; Leek et al., 2012; Qian et al., 2024; Troyanskaya et al., 2001; Wang et al., 2024). Masking involves intentionally creating missing values in observed entries so that imputation accuracy can be measured against a known ground truth. This approach has demonstrated remarkable success in other domains, notably in language modeling (Devlin et al., 2019; Yang et al., 2019), image recognition (Hondru et al., 2025; Vincent et al., 2010; Xie et al., 2022), and prediction-powered inference (Angelopoulos et al., 2023; Wang et al., 2020).
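Since masking is the procedure under discussion, a minimal sketch may help fix ideas: hide a random fraction of the observed entries, run a candidate imputer, and score it against the held-out truth. All names below are illustrative, and mean squared error is just one possible score:

```python
import numpy as np

def masked_score(X, imputer, mask_frac=0.1, rng=None):
    """MSE of `imputer` on artificially masked entries of X (NaN = missing).
    `imputer` maps an array with NaNs to a completed array of the same shape."""
    rng = rng or np.random.default_rng()
    idx = np.argwhere(~np.isnan(X))                   # currently observed entries
    hide = idx[rng.random(len(idx)) < mask_frac]      # entries to mask for evaluation
    X_masked = X.copy()
    X_masked[hide[:, 0], hide[:, 1]] = np.nan
    X_hat = imputer(X_masked)
    truth = X[hide[:, 0], hide[:, 1]]
    return float(np.mean((X_hat[hide[:, 0], hide[:, 1]] - truth) ** 2))

# a column-mean imputer as a trivial baseline candidate
mean_impute = lambda X: np.where(np.isnan(X), np.nanmean(X, axis=0), X)
```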
kNNSampler: Stochastic Imputations for Recovering Missing Value Distributions
Pashmchi, Parastoo, Benoit, Jerome, Kanagawa, Motonobu
We study a missing-value imputation method, termed kNNSampler, that imputes a given unit's missing response by randomly sampling from the observed responses of the $k$ most similar units to the given unit in terms of the observed covariates. This method can sample unknown missing values from their distributions, quantify the uncertainties of missing values, and be readily used for multiple imputation. Unlike the popular kNNImputer, which estimates the conditional mean of a missing response given an observed covariate, kNNSampler is theoretically shown to estimate the conditional distribution of a missing response given an observed covariate. Experiments demonstrate its effectiveness in recovering the distribution of missing values. The code for kNNSampler is publicly available (https://github.com/SAP/knn-sampler).
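The sampling step is simple enough to sketch directly. The snippet below is a toy rendering of the idea using scikit-learn's NearestNeighbors, not the authors' implementation (which lives at the repository linked above):

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def knn_sample(X_obs, y_obs, X_miss, k=5, rng=None):
    """One stochastic imputation per row of X_miss: pick a uniformly random
    response among the k nearest complete units in covariate space."""
    rng = rng or np.random.default_rng()
    nn = NearestNeighbors(n_neighbors=k).fit(X_obs)
    _, idx = nn.kneighbors(X_miss)                   # (n_miss, k) neighbor indices
    picks = rng.integers(0, k, size=len(X_miss))     # one random neighbor per row
    return y_obs[idx[np.arange(len(X_miss)), picks]]
```

Calling `knn_sample` repeatedly yields independent draws, which is why the scheme plugs directly into multiple imputation.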
Meaning-infused grammar: Gradient Acceptability Shapes the Geometric Representations of Constructions in LLMs
Rakshit, Supantho, Goldberg, Adele
The usage-based constructionist (UCx) approach to language posits that language comprises a network of learned form-meaning pairings (constructions) whose use is largely determined by their meanings or functions, requiring them to be graded and probabilistic. This study investigates whether the internal representations in Large Language Models (LLMs) reflect the proposed function-infused gradience. We analyze representations of the English Double Object (DO) and Prepositional Object (PO) constructions in Pythia-$1.4$B, using a dataset of $5000$ sentence pairs systematically varied by human-rated preference strength for DO or PO. Geometric analyses show that the separability between the two constructions' representations, as measured by energy distance or Jensen-Shannon divergence, is systematically modulated by gradient preference strength, which depends on lexical and functional properties of sentences. That is, more prototypical exemplars of each construction occupy more distinct regions in activation space than sentences that could equally well have occurred in either construction. These results provide evidence that LLMs learn rich, meaning-infused, graded representations of constructions, and they support the use of geometric measures for studying representations in LLMs.
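The separability analysis can be sketched with the same pairwise-distance estimator of energy distance used in the first sketch of this digest, computed within bins of preference strength. Array names and the binning scheme below are assumptions for illustration (and each bin is assumed to contain both constructions):

```python
import numpy as np
from scipy.spatial.distance import cdist

def energy_dist(a, b):
    return 2 * cdist(a, b).mean() - cdist(a, a).mean() - cdist(b, b).mean()

def separability_by_bin(acts, pref, label, bins=5):
    """acts[i]: hidden-state vector; pref[i]: human-rated preference strength;
    label[i] in {"DO", "PO"}. Returns one energy distance per preference bin."""
    edges = np.quantile(np.abs(pref), np.linspace(0, 1, bins + 1))
    out = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        m = (np.abs(pref) >= lo) & (np.abs(pref) <= hi)
        out.append(energy_dist(acts[m & (label == "DO")],
                               acts[m & (label == "PO")]))
    return out  # per the abstract, expected to grow with preference strength
```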
Evaluating the Efficiency of Latent Spaces via the Coupling-Matrix
Yavuz, Mehmet Can, Yanikoglu, Berrin
A central challenge in representation learning is constructing latent embeddings that are both expressive and efficient. In practice, deep networks often produce redundant latent spaces where multiple coordinates encode overlapping information, reducing effective capacity and hindering generalization. Standard metrics such as accuracy or reconstruction loss provide only indirect evidence of such redundancy and cannot isolate it as a failure mode. We introduce a redundancy index, denoted $\rho(C)$, that directly quantifies inter-dimensional dependencies by analyzing coupling matrices derived from latent representations and comparing their off-diagonal statistics against a normal distribution via energy distance. The result is a compact, interpretable, and statistically grounded measure of representational quality. We validate $\rho(C)$ across discriminative and generative settings on MNIST variants, Fashion-MNIST, CIFAR-10, and CIFAR-100, spanning multiple architectures and hyperparameter optimization strategies. Empirically, low $\rho(C)$ reliably predicts high classification accuracy or low reconstruction error, while elevated redundancy is associated with performance collapse. Estimator reliability grows with latent dimension, yielding natural lower bounds for reliable analysis. We further show that Tree-structured Parzen Estimators (TPE) preferentially explore low-$\rho(C)$ regions, suggesting that $\rho(C)$ can guide neural architecture search and serve as a redundancy-aware regularization target. By exposing redundancy as a universal bottleneck across models and tasks, $\rho(C)$ offers both a theoretical lens and a practical tool for evaluating and improving the efficiency of learned representations.
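A hedged sketch of one possible reading of the construction: form a coupling matrix from latent codes (plain correlations here, which may well differ from the paper's definition), and compare its off-diagonal entries to a moment-matched normal reference via SciPy's one-dimensional energy distance. The choice of reference parameters is an assumption:

```python
import numpy as np
from scipy.stats import energy_distance

def redundancy_index(Z, rng=None):
    """Z: (n_samples, latent_dim) latent codes. The correlation-based coupling
    and the moment-matched normal reference are illustrative assumptions."""
    rng = rng or np.random.default_rng()
    C = np.corrcoef(Z, rowvar=False)              # coupling matrix across latent dims
    off = C[~np.eye(C.shape[0], dtype=bool)]      # off-diagonal couplings only
    ref = rng.normal(off.mean(), off.std() + 1e-12, size=off.size)
    return energy_distance(off, ref)              # 1-D energy distance to the reference
```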
Weighted Support Points from Random Measures: An Interpretable Alternative for Generative Modeling
Zhao, Peiqi, Rodríguez, Carlos E., Mena, Ramsés H., Walker, Stephen G.
Support points summarize a large dataset through a smaller set of representative points that can be used for data operations, such as Monte Carlo integration, without requiring access to the full dataset. In this sense, support points offer a compact yet informative representation of the original data. We build on this idea to introduce a generative modeling framework based on random weighted support points, where the randomness arises from a weighting scheme inspired by the Dirichlet process and the Bayesian bootstrap. The proposed method generates diverse and interpretable sample sets from a fixed dataset, without relying on probabilistic modeling assumptions or neural network architectures. We present the theoretical formulation of the method and develop an efficient optimization algorithm based on the Convex-Concave Procedure (CCP). Empirical results on the MNIST and CelebA-HQ datasets show that our approach produces high-quality and diverse outputs at a fraction of the computational cost of black-box alternatives such as Generative Adversarial Networks (GANs) or Denoising Diffusion Probabilistic Models (DDPMs). These results suggest that random weighted support points offer a principled, scalable, and interpretable alternative for generative modeling. A key feature is their ability to produce genuinely interpolative samples that preserve underlying data structure.
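The weighted objective lends itself to a short sketch. The paper optimizes with a CCP solver; the version below substitutes plain gradient descent on the weighted energy objective, with Dirichlet(1, ..., 1) weights for the Bayesian bootstrap, and is an illustration rather than the authors' algorithm:

```python
import numpy as np

def weighted_support_points(X, N=20, steps=200, lr=0.05, rng=None):
    """X: (n, d) data. Returns N support points for one Dirichlet weight draw."""
    rng = rng or np.random.default_rng()
    n = len(X)
    w = rng.dirichlet(np.ones(n))                     # Bayesian bootstrap weights
    xi = X[rng.choice(n, size=N, replace=False)].astype(float)
    for _ in range(steps):
        # attraction toward the weighted data: sum_i w_i (xi_j - x_i) / ||xi_j - x_i||
        diff_xd = xi[:, None, :] - X[None, :, :]              # (N, n, d)
        r_xd = np.linalg.norm(diff_xd, axis=-1, keepdims=True) + 1e-9
        g_att = (w[None, :, None] * diff_xd / r_xd).sum(axis=1)
        # repulsion among support points: (1/N) sum_k (xi_j - xi_k) / ||xi_j - xi_k||
        diff_xx = xi[:, None, :] - xi[None, :, :]             # (N, N, d)
        r_xx = np.linalg.norm(diff_xx, axis=-1, keepdims=True) + 1e-9
        g_rep = diff_xx.sum(axis=1) * 0 + (diff_xx / r_xx).sum(axis=1) / N
        xi -= lr * (g_att - g_rep)   # energy gradient; constant factors folded into lr
    return xi
```

Each fresh Dirichlet draw yields a different weighting of the data and hence a different support-point set, which is where the method's sample diversity comes from.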